
    Hidden Markov Models and their Application for Predicting Failure Events

    We show how Markov mixed membership models (MMMM) can be used to predict the degradation of assets. We model the degradation path of individual assets to predict overall failure rates. Instead of a separate distribution for each hidden state, we use hierarchical mixtures of distributions in the exponential family: the observation distribution of each state is a finite mixture of a small set of (simpler) distributions shared across all states. Using tied-mixture observation distributions offers several advantages. The mixtures act as a regularization for typically very sparse problems, and they reduce the computational effort of the learning algorithm since fewer distributions have to be estimated. Sharing mixtures also shares statistical strength between the Markov states and thus enables transfer learning. Finally, we determine for individual assets the trade-off between the risk of failure and extended operating hours by combining an MMMM with a partially observable Markov decision process (POMDP) to dynamically optimize the policy for when and how to maintain the asset. (To appear in the proceedings of ICCS 2020; EasyChair Preprint no. 3183, 2020.)
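
    As a loose illustration of the tied-mixture idea described above, the sketch below runs a scaled forward pass through a small hidden Markov model in which every state's emission distribution is a mixture over one shared pool of Gaussian components, with only the mixture weights differing by state. All parameter values, the Gaussian choice, and the left-to-right transition structure are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.stats import norm

# Shared pool of Gaussian components reused by every hidden state (illustrative values).
comp_mu = np.array([0.0, 1.0, 2.5, 4.0])
comp_sd = np.array([0.5, 0.5, 0.8, 1.0])

# State-specific mixture weights over the shared components (rows sum to 1):
# this is the "tied-mixture" emission structure.
weights = np.array([
    [0.7, 0.2, 0.1, 0.0],
    [0.1, 0.5, 0.3, 0.1],
    [0.0, 0.1, 0.3, 0.6],
])

pi = np.full(3, 1.0 / 3.0)            # initial state distribution
A = np.array([[0.90, 0.08, 0.02],     # assumed left-to-right degradation chain
              [0.00, 0.85, 0.15],
              [0.00, 0.00, 1.00]])

def emission_probs(x):
    """P(x | state): weighted sum of the shared component densities."""
    dens = norm.pdf(x, comp_mu, comp_sd)   # one density per shared component
    return weights @ dens                  # one value per hidden state

def forward_loglik(obs):
    """Log-likelihood of an observation sequence via a scaled forward pass."""
    alpha = pi * emission_probs(obs[0])
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for x in obs[1:]:
        alpha = (alpha @ A) * emission_probs(x)
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return loglik

# Toy degradation signal drifting upward over time.
print(forward_loglik(np.array([0.1, 0.3, 1.2, 2.8, 4.1])))
```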

    Crude incidence in two-phase designs in the presence of competing risks.

    Background: In many studies some information might not be available for the whole cohort; some covariates, or even the outcome, might be ascertained only in selected subsamples. These studies are part of a broad category termed two-phase studies. Common examples include the nested case-control and the case-cohort designs. For two-phase studies, appropriate weighted survival estimates have been derived; however, no estimator of cumulative incidence accounting for competing events has been proposed. This is relevant in the presence of multiple types of events, where estimation of event-type-specific quantities is needed for evaluating outcome. Methods: We develop a non-parametric estimator of the cumulative incidence function of events accounting for possible competing events. It handles a general sampling design by weights derived from the sampling probabilities. The variance is derived from the influence function of the subdistribution hazard. Results: The proposed method shows good performance in simulations. It is applied to estimate the crude incidence of relapse in childhood acute lymphoblastic leukemia in groups defined by a genotype not available for everyone in a cohort of nearly 2000 patients, where death due to toxicity acted as a competing event. In a second example the aim was to estimate engagement in care in a cohort of HIV patients in a resource-limited setting, where for some patients the outcome itself was missing due to loss to follow-up. A sampling-based approach was used to ascertain the outcome in a subsample of lost patients and to obtain a valid estimate of connection to care. Conclusions: A valid estimator for the cumulative incidence of events accounting for competing risks under a general sampling design from an infinite target population is derived.
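
    A minimal sketch of the kind of weighted estimator described above, assuming inverse-probability weights equal to one over each subject's phase-two sampling probability: a weighted Aalen-Johansen-type cumulative incidence computation. The function name, the toy event coding, and the weights are invented for illustration; the paper's influence-function-based variance is not reproduced here.

```python
import numpy as np
import pandas as pd

def weighted_cuminc(time, event, weight, cause=1):
    """Weighted Aalen-Johansen-type estimate of P(T <= t, event = cause)."""
    df = pd.DataFrame({"t": time, "e": event, "w": weight}).sort_values("t")
    times = np.unique(df.loc[df.e > 0, "t"])
    surv, cif, out = 1.0, 0.0, []
    for t in times:
        at_risk = df.loc[df.t >= t, "w"].sum()             # weighted risk set
        d_cause = df.loc[(df.t == t) & (df.e == cause), "w"].sum()
        d_any = df.loc[(df.t == t) & (df.e > 0), "w"].sum()
        cif += surv * d_cause / at_risk                    # cause-specific jump
        surv *= 1.0 - d_any / at_risk                      # overall survival update
        out.append((t, cif))
    return pd.DataFrame(out, columns=["time", "cuminc"])

# Toy two-phase data: event 1 = relapse, 2 = competing death, 0 = censored;
# weight = 1 / phase-two sampling probability (invented values).
t = [2, 3, 4, 5, 7, 8, 10, 12]
e = [1, 0, 2, 1, 0, 2, 1, 0]
w = [2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0]
print(weighted_cuminc(t, e, w, cause=1))
```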

    Accounting for Population Stratification in Practice: A Comparison of the Main Strategies Dedicated to Genome-Wide Association Studies

    Genome-Wide Association Studies are powerful tools to detect genetic variants associated with diseases. Their results have, however, been questioned, in part because of the bias induced by population stratification. This bias is a consequence of systematic differences in allele frequencies due to differences in sample ancestries and can lead to both false positive and false negative findings. Many strategies are available to account for stratification, but their performances differ, for instance according to the type of population structure, the disease susceptibility locus minor allele frequency, the degree of sampling imbalance, or the sample size. We focus on the type of population structure and propose a comparison of the most commonly used methods to deal with stratification: Genomic Control, principal-component-based methods such as those implemented in Eigenstrat, adjusted regressions, and meta-analysis strategies. Our assessment of the methods is based on a large simulation study, involving several scenarios corresponding to many types of population structures. We focused on both false positive rate and power to determine which methods perform best. Our analysis showed that, in the absence of population structure, none of the tests introduced bias or decreased power, except for the meta-analyses. When the population is stratified, adjusted logistic regressions and Eigenstrat are the best solutions to account for stratification, even though only the logistic regressions are able to consistently maintain correct false positive rates. This study provides more details about these methods. Their advantages and limitations in different stratification scenarios are highlighted in order to propose practical guidelines to account for population stratification in Genome-Wide Association Studies.
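
    For concreteness, the sketch below renders two of the compared corrections on simulated data: genomic control, which rescales 1-df trend statistics by the inflation factor lambda, and a logistic regression adjusted for top principal components of the genotype matrix (the Eigenstrat-style idea). The simulated data, the number of components retained, and the use of statsmodels are assumptions for illustration only.

```python
import numpy as np
from scipy.stats import chi2
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, m = 500, 1000
geno = rng.binomial(2, 0.3, size=(n, m)).astype(float)   # 0/1/2 genotype counts
pheno = rng.binomial(1, 0.5, size=n)                      # case/control status

# Genomic control: the 1-df Armitage trend statistic per SNP equals N * corr^2,
# and lambda is the median statistic divided by the chi-square(1) median.
r = np.array([np.corrcoef(geno[:, j], pheno)[0, 1] for j in range(m)])
trend_stats = n * r ** 2
lam = np.median(trend_stats) / chi2.ppf(0.5, df=1)
corrected_stats = trend_stats / max(lam, 1.0)             # rescaled statistics

# Eigenstrat-style adjustment: logistic regression of the phenotype on one SNP
# plus the top principal components of the standardized genotype matrix.
std = (geno - geno.mean(0)) / (geno.std(0) + 1e-9)
pcs = np.linalg.svd(std, full_matrices=False)[0][:, :10]
X = sm.add_constant(np.column_stack([geno[:, 0], pcs]))
fit = sm.Logit(pheno, X).fit(disp=0)
print(f"lambda = {lam:.3f}, PC-adjusted p-value for SNP 0 = {fit.pvalues[1]:.3f}")
```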

    Joint Analysis for Genome-Wide Association Studies in Family-Based Designs

    In family-based data, association information can be partitioned into the between-family information and the within-family information. Based on this observation, Steen et al. (Nature Genetics. 2005, 683–691) proposed an interesting two-stage test for genome-wide association (GWA) studies under family-based designs which performs genomic screening and replication using the same data set. In the first stage, a screening test based on the between-family information is used to select markers. In the second stage, an association test based on the within-family information is used to test association at the selected markers. However, we learn from the results of case-control studies (Skol et al. Nature Genetics. 2006, 209–213) that this two-stage approach may not be optimal. In this article, we propose a novel two-stage joint analysis for GWA studies under family-based designs. For this joint analysis, we first propose a new screening test that is based on the between-family information and is robust to population stratification. This new screening test is used in the first stage to select markers. Then, a joint test that combines the between-family information and the within-family information is used in the second stage to test association at the selected markers. Through extensive simulation studies, we demonstrate that the joint analysis always results in increased power to detect genetic association and is robust to population stratification.
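
    A minimal sketch of the joint-analysis principle, assuming the between-family and within-family statistics are independent and standard normal under the null: screen on the between-family Z score, then combine both scores with weights proportional to the square root of the information each stage carries. The weights, threshold, and function name are illustrative; in practice the joint significance threshold must also account for the stage-one selection.

```python
import numpy as np
from scipy.stats import norm

def joint_test(z_between, z_within, info_between=0.5, alpha_screen=0.01):
    """Screen on the between-family statistic, then jointly test selected markers."""
    if 2 * norm.sf(abs(z_between)) > alpha_screen:
        return None                               # marker not selected in stage 1
    w1 = np.sqrt(info_between)                    # share of information in stage 1
    w2 = np.sqrt(1.0 - info_between)              # share of information in stage 2
    z_joint = w1 * z_between + w2 * z_within      # N(0,1) under the null, given independence
    # Note: a real analysis sets the joint significance threshold to account
    # for the stage-one selection rather than using this raw p-value directly.
    return 2 * norm.sf(abs(z_joint))

print(joint_test(z_between=3.1, z_within=2.4))
```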

    Assessment of BED HIV-1 Incidence Assay in Seroconverter Cohorts: Effect of Individuals with Long-Term Infection and Importance of Stable Incidence

    BACKGROUND: Performance of the BED assay in estimating HIV-1 incidence has previously been evaluated by using longitudinal specimens from persons with incident HIV infections, but questions remain about its accuracy. We sought to assess its performance in three longitudinal cohorts from Thailand where HIV-1 CRF01_AE and subtype B' dominate the epidemic. DESIGN: BED testing was conducted in two longitudinal cohorts with only incident infections (a military conscript cohort and an injection drug user cohort) and in one longitudinal cohort (an HIV-1 vaccine efficacy trial cohort) that also included long-term infections. METHODS: Incidence estimates were generated conventionally (based on the number of annual seroconversions) and by using BED test results in the three cohorts. Adjusted incidence was calculated where appropriate. RESULTS: For each longitudinal cohort the BED incidence estimates and the conventional incidence estimates were similar when only newly infected persons were tested, whether infected with CRF01_AE or subtype B'. When the analysis included persons with long-term infections (to mimic a true cross-sectional cohort), BED incidence estimates were higher, although not significantly, than the conventional incidence estimates. After adjustment, the BED incidence estimates were closer to the conventional incidence estimates. When the conventional incidence varied over time, as in the early phase of the injection drug user cohort, the difference between the two estimates increased, but not significantly. CONCLUSIONS: Evaluation of the performance of incidence assays requires the inclusion of a substantial number of cohort-derived specimens from individuals with long-term HIV infection and, ideally, the use of cohorts in which incidence remained stable. Appropriate adjustments of the BED incidence estimates generate estimates similar to those generated conventionally.
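
    A minimal sketch of a cross-sectional incidence calculation of the kind the BED assay enables, with a generic correction for long-term infections misclassified as recent. The specific adjustment formula, window period, false-recent rate, and counts are illustrative assumptions and not necessarily those applied in the study.

```python
def bed_incidence(n_recent, n_positive, n_negative,
                  window_years=0.5, false_recent_rate=0.0):
    """Annualized incidence estimate from one cross-sectional survey."""
    # Remove the expected number of long-term infections misclassified as
    # recent, then divide by person-time at risk, approximated here by the
    # HIV-negative count times the mean recency window.
    adj_recent = max(n_recent - false_recent_rate * n_positive, 0.0)
    return adj_recent / (n_negative * window_years)

unadjusted = bed_incidence(40, 600, 4000)
adjusted = bed_incidence(40, 600, 4000, false_recent_rate=0.02)
print(f"unadjusted: {unadjusted:.4f}, adjusted: {adjusted:.4f} per person-year")
```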

    Genetic risk factors for cerebrovascular disease in children with sickle cell disease: design of a case-control association study and genomewide screen

    BACKGROUND: The phenotypic heterogeneity of sickle cell disease is likely the result of multiple genetic factors and their interaction with the sickle mutation. High transcranial doppler (TCD) velocities define a subgroup of children with sickle cell disease who are at increased risk for developing ischemic stroke. The genetic factors leading to the development of a high TCD velocity (i.e., cerebrovascular disease) and ultimately to stroke are not well characterized. METHODS: We have designed a case-control association study to elucidate the role of genetic polymorphisms as risk factors for cerebrovascular disease, as measured by a high TCD velocity, in children with sickle cell disease. The study will consist of two parts, a candidate gene study and a genomewide screen, and will be performed in 230 cases and 400 controls. Cases will include 130 patients (TCD ≥ 200 cm/s) randomized in the Stroke Prevention Trial in Sickle Cell Anemia (STOP) study as well as 100 other patients found to have high TCD in STOP II screening. Four hundred sickle cell disease patients with a normal TCD velocity (TCD < 170 cm/s) will be controls. The candidate gene study will involve the analysis of 28 genetic polymorphisms in 20 candidate genes. The polymorphisms include mutations in coagulation factor genes (Factor V, Prothrombin, Fibrinogen, Factor VII, Factor XIII, PAI-1) and in genes involved in platelet activation/function (GpIIb/IIIa, GpIb IX-V, GpIa/IIa), vascular reactivity (ACE), endothelial cell function (MTHFR, thrombomodulin, VCAM-1, E-Selectin, L-Selectin, P-Selectin, ICAM-1), inflammation (TNFα), lipid metabolism (Apo A1, Apo E), and cell adhesion (VCAM-1, E-Selectin, L-Selectin, P-Selectin, ICAM-1). We will perform a genomewide screen of validated single nucleotide polymorphisms (SNPs) in pooled DNA samples from 230 cases and 400 controls to study the possible association of additional polymorphisms with the high-risk phenotype. High-throughput SNP genotyping will be performed through MALDI-TOF technology using Sequenom's MassARRAY™ system. DISCUSSION: It is expected that this study will yield important information on genetic risk factors for the cerebrovascular disease phenotype in sickle cell disease by clarifying the role of candidate genes in the development of high TCD. The genomewide screen for a large number of SNPs may uncover the association of novel polymorphisms with cerebrovascular disease and stroke in sickle cell disease.
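
    As a sketch of the per-polymorphism analysis such a case-control design implies, the snippet below compares allele counts between cases (high TCD) and controls (normal TCD) with a 2x2 chi-square test and reports an allelic odds ratio. The counts and the choice of an allelic (rather than genotypic) test are illustrative assumptions.

```python
import numpy as np
from scipy.stats import chi2_contingency

def allelic_test(case_alt, case_n, ctrl_alt, ctrl_n):
    """2x2 allele-count test: rows = cases/controls, columns = alt/ref alleles."""
    table = np.array([
        [case_alt, 2 * case_n - case_alt],    # each subject contributes 2 alleles
        [ctrl_alt, 2 * ctrl_n - ctrl_alt],
    ])
    stat, p, _, _ = chi2_contingency(table, correction=False)
    odds_ratio = (table[0, 0] * table[1, 1]) / (table[0, 1] * table[1, 0])
    return stat, p, odds_ratio

# Hypothetical counts for one candidate polymorphism in 230 cases and 400 controls.
print(allelic_test(case_alt=92, case_n=230, ctrl_alt=120, ctrl_n=400))
```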

    Population Substructure and Control Selection in Genome-Wide Association Studies

    Determining the relevance of demanding classical epidemiologic criteria for control selection, and handling population stratification (PS) robustly, represent major challenges in the design and analysis of genome-wide association studies (GWAS). Empirical data from two GWAS in European Americans of the Cancer Genetic Markers of Susceptibility (CGEMS) project were used to evaluate the impact of PS in studies with different control selection strategies. In each of the two original case-control studies nested in corresponding prospective cohorts, a minor confounding effect due to PS (inflation factor λ of 1.025 and 1.005) was observed. In contrast, when the control groups were exchanged to mimic a cost-effective but theoretically less desirable control selection strategy, the confounding effects were larger (λ of 1.090 and 1.062). A panel of 12,898 autosomal SNPs common to both the Illumina and Affymetrix commercial platforms and with low local background linkage disequilibrium (pair-wise r2<0.004) was selected to infer population substructure with principal component analysis. A novel permutation procedure was developed for the correction of PS that identified a smaller set of principal components and achieved better control of type I error (reducing λ to 1.032 and 1.006, respectively) than currently used methods. The overlap between sets of SNPs in the bottom 5% of p-values based on the new test and the test without PS correction was about 80%, with the majority of discordant SNPs having both ranks close to the threshold. Thus, for the CGEMS GWAS of prostate and breast cancer conducted in European Americans, PS does not appear to be a major problem in well-designed studies. A study using suboptimal controls can have acceptable type I error when an effective strategy for the correction of PS is employed.
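
    The sketch below illustrates a generic permutation check of how many principal components reflect real substructure: each SNP's genotypes are permuted independently across individuals to break any structure, the PCA is recomputed, and an observed eigenvalue is retained only if it exceeds the corresponding permutation quantile. This is a hedged rendering of the general idea, not the paper's exact permutation procedure, and all data are simulated.

```python
import numpy as np

rng = np.random.default_rng(1)

def top_eigenvalues(geno, k):
    """Top-k eigenvalues of the standardized genotype covariance matrix."""
    std = (geno - geno.mean(0)) / (geno.std(0) + 1e-9)
    s = np.linalg.svd(std, compute_uv=False)
    return (s ** 2)[:k] / (geno.shape[0] - 1)

def n_significant_pcs(geno, k=5, n_perm=10, quantile=0.95):
    """Count PCs whose eigenvalue exceeds the permutation-null quantile."""
    observed = top_eigenvalues(geno, k)
    null = np.empty((n_perm, k))
    for b in range(n_perm):
        shuffled = np.column_stack(
            [rng.permutation(col) for col in geno.T]   # break structure SNP by SNP
        )
        null[b] = top_eigenvalues(shuffled, k)
    return int(np.sum(observed > np.quantile(null, quantile, axis=0)))

# Toy genotypes: 300 individuals drawn from two subpopulations, 1000 SNPs.
freqs = rng.uniform(0.1, 0.9, size=(2, 1000))
labels = rng.integers(0, 2, 300)
geno = rng.binomial(2, freqs[labels]).astype(float)
print("PCs retained:", n_significant_pcs(geno))
```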

    How to handle mortality when investigating length of hospital stay and time to clinical stability

    Background: Hospital length of stay (LOS) and time for a patient to reach clinical stability (TCS) have increasingly become important outcomes when investigating ways in which to combat Community Acquired Pneumonia (CAP). Difficulties arise when deciding how to handle in-hospital mortality. Ad-hoc approaches that are commonly used to handle time-to-event outcomes with mortality can give disparate results and provide conflicting conclusions based on the same data. To ensure compatibility among studies investigating these outcomes, this type of data should be handled in a consistent and appropriate fashion. Methods: Using both simulated data and data from the international Community Acquired Pneumonia Organization (CAPO) database, we evaluate two ad-hoc approaches for handling mortality when estimating the probability of hospital discharge and clinical stability: 1) restricting analysis to those patients who lived, and 2) assigning individuals who die the "worst" outcome (right-censoring them at the longest recorded LOS or TCS). Estimated probability distributions based on these approaches are compared with right-censoring the individuals who died at time of death (the complement of the Kaplan-Meier (KM) estimator), and treating death as a competing risk (the cumulative incidence estimator). Tests for differences in probability distributions based on the four methods are also contrasted. Results: The two ad-hoc approaches give different estimates of the probability of discharge and clinical stability. Analysis restricted to patients who survived is conceptually problematic, as estimation is conditioned on events that happen at a future time. Estimation based on assigning those patients who died the worst outcome (longest LOS and TCS) coincides with the complement of the KM estimator based on the subdistribution hazard, which has been previously shown to be equivalent to the cumulative incidence estimator. However, in either case the time to in-hospital mortality is ignored, preventing simultaneous assessment of patient mortality in addition to LOS and/or TCS. The power to detect differences in underlying hazards of discharge between patient populations differs for test statistics based on the four approaches, and depends on the underlying hazard ratio of mortality between the patient groups. Conclusions: Treating death as a competing risk gives estimators which address the clinical questions of interest, and allows for simultaneous modelling of both in-hospital mortality and TCS/LOS. This article advocates treating mortality as a competing risk when investigating other time-related outcomes.
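
    A minimal sketch of the comparison made above, assuming the lifelines package is available (any survival library with Kaplan-Meier and Aalen-Johansen estimators would do): the complement of the KM estimator that censors deaths at the time of death versus the cumulative incidence of discharge with death treated as a competing risk. The toy data and event coding are invented.

```python
import pandas as pd
from lifelines import KaplanMeierFitter, AalenJohansenFitter

# Toy data; event codes: 1 = discharged, 2 = died in hospital, 0 = censored.
df = pd.DataFrame({
    "los":   [3, 4, 5, 7, 8, 10, 12, 14, 15, 20],
    "event": [1, 1, 2, 1, 2, 1,  0,  1,  2,  1],
})

# (a) Complement of the Kaplan-Meier estimator, censoring deaths at the time
# of death; this tends to overstate the probability of discharge.
kmf = KaplanMeierFitter()
kmf.fit(df["los"], event_observed=(df["event"] == 1))
prob_discharge_km = 1 - kmf.survival_function_

# (b) Cumulative incidence of discharge with death as a competing risk.
ajf = AalenJohansenFitter()
ajf.fit(df["los"], df["event"], event_of_interest=1)
prob_discharge_ci = ajf.cumulative_density_

print(prob_discharge_km.tail(1))
print(prob_discharge_ci.tail(1))
```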

    A Robust Statistical Method for Association-Based eQTL Analysis

    Background: It has been well established that the theoretical kernel of the recently surging genome-wide association studies (GWAS) is statistical inference of linkage disequilibrium (LD) between a tested genetic marker and a putative locus affecting a disease trait. However, LD analysis is vulnerable to several confounding factors, of which population stratification is the most prominent. Whilst many methods have been proposed to correct for this influence, either by predicting the structure parameters or by correcting the inflation in the test statistic due to stratification, these may not be feasible or may impose further statistical problems in practical implementation. Methodology: We propose here a novel statistical method to control spurious LD in GWAS arising from population structure by incorporating a control marker into the test for significance of genetic association between a polymorphic marker and phenotypic variation of a complex trait. The method avoids the need for structure prediction, which may be infeasible or inadequate in practice, and properly accounts for a varying effect of population stratification on different regions of the genome under study. Utility and statistical properties of the new method were tested through an intensive computer simulation study and an association-based genome-wide mapping of expression quantitative trait loci in genetically divergent human populations. Results/Conclusions: The analyses show that the new method confers improved statistical power for detecting genuine associations.
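
    A minimal sketch of the control-marker idea, assuming a simple regression rendering: the marker of interest is tested for association with an expression trait while conditioning on an unlinked control marker whose apparent effect arises only from population structure. The simulated populations, the OLS formulation, and all parameter values are illustrative assumptions rather than the paper's exact test statistic.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 400
pop = rng.integers(0, 2, n)                       # two hidden subpopulations

# Both markers differ in frequency between subpopulations; the trait mean also
# differs by subpopulation, so any marker-trait association here is spurious.
test_snp = rng.binomial(2, np.where(pop == 1, 0.7, 0.3)).astype(float)
control_snp = rng.binomial(2, np.where(pop == 1, 0.8, 0.2)).astype(float)
expression = 1.5 * pop + rng.normal(size=n)       # stratified expression trait

naive = sm.OLS(expression, sm.add_constant(test_snp)).fit()
adjusted = sm.OLS(
    expression, sm.add_constant(np.column_stack([test_snp, control_snp]))
).fit()

print("naive p-value:   ", naive.pvalues[1])      # typically spuriously small
print("adjusted p-value:", adjusted.pvalues[1])   # much of the structure absorbed
```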